Here we model an intervention to address malnutrition among school-aged children in urban Hanoi. The model was developed across several iterative workshops in July 2023. These included decision definition, conceptual model development of the selected decision and an initial programming session with preliminary model results. The resulting model was further developed through August to November 2023. A test run of the intervention will be carried out by the Center for the Development of Organic Agriculture (CODAS) under the Association of Organic Agriculture of Vietnam.
The gardens will be existing areas on school grounds. The STEM option gardens will be designed to for teaching about nutrition and STEM (an approach to learning and development that integrates science, technology, engineering and maths) at primary and secondary schools. In the 2nd year the garden is expected to start running well. The 3rd year is when the STEM education plan will be fully running.
We use several different functions to create a working model of the garden intervention.
The chance_event function generates conditional values
with:
\[ occurrence = \begin{cases} rbinom(length(value\_if), 1, chance) & if !one\_draw \\ rbinom(1, 1, chance) & if one\_draw \end{cases} \]
and final values with:
\[ chance\_event = occurrence \cdot value\_if + (1 - occurrence) \cdot value\_if\_not \]
source("functions/vv.R")
The vv function generates synthetic data with varying
means and coefficients of variation with:
These are used to calculate annual means:
If both absolute and relative trends are absent, the annual means remain constant:
\[ annual\_means = var\_mean \]
If an absolute trend is present, annual means are adjusted linearly:
\[ annual\_means = var\_mean + absolute\_trend \cdot (0, 1, \ldots, n-1) \]
If only a relative trend is present, annual means change multiplicatively:
\[ annual\_means = var\_mean \cdot (1 + relative\_trend / 100)^{(0, 1, \ldots, n-1)} \]
Generate random data points based on the specified distribution using calculated annual means:
\[ out \sim \mathcal{N}(annual\_means, \lvert annual\_means \rvert \cdot \frac{var\_CV}{100}) \]
If specified, limit data points to the defined lower and upper limits:
\[ out = \max(out, lower\_limit), \quad out = \min(out, upper\_limit) \]
source("functions/discount.R")
The discounted value of a time series value \(x_t\) at time \(t\) can be calculated using the formula:
\[ \text{Discounted Value}_t = \frac{x_t}{\left(1 + \frac{\text{Discount Rate}}{100}\right)^{t-1}} \]
Where:
\(x_t\) is the value at time \(t\),
\(\text{Discount Rate}\) is the discount rate provided to the function,
\(t\) is the time period (starting from 1).
Net Present Value (NPV) (if
calculate_NPV is set to TRUE): The Net Present
Value of a time series with discounted values can be calculated as the
sum of all discounted values:
\[ \text{NPV} = \sum_{t=1}^{n} \frac{x_t}{\left(1 + \frac{\text{Discount Rate}}{100}\right)^{t-1}} \]
Where \(n\) is the number of time periods in the time series.
source("functions/mcSimulation.R")
##
## Attaching package: 'decisionSupport'
## The following objects are masked _by_ '.GlobalEnv':
##
## discount, vv
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
We use the mcSimulation function to perform Monte Carlo
simulations to estimate model outputs based on provided parameters and a
model function.
The Monte Carlo simulation process generates a set of estimated model outputs based on random input samples, providing a distribution of potential outcomes including:
random input generation with \(x_1, x_2, \ldots, x_n\) as a set of \(n\) random input samples drawn from a given distribution, representing the input parameters of the model.
a model function \(f(x)\) that maps the input data \(x\) to the corresponding model output \(y\). This can be expressed as:
\[ y_i = f(x_i), \quad i = 1, 2, \ldots, n \]
Estimated model outputs \(\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n\) obtained by applying the model function to the random input samples:
\[ \hat{y}_i = f(x_i), \quad i = 1, 2, \ldots, n \]
# Source our model
source("CODAS_Garden_Model.R")
## Warning: Variable: outside_investment_value distribution: posnorm
## Calculated value of 5%-quantile: 1.77932900574854
## Target value of 5%-quantile: 1
## Calculated cumulative probability at value 1 : 0.0289226319595083
## Target cumulative probability at value 1 : 0.05
## Mean scaled difference: 0.4215474
## Warning in paramtnormci_numeric(p = p, ci = ci, lowerTrunc = lowerTrunc, : Calculated value of 5%-quantile: 1.77932900574854
## Target value of 5%-quantile: 1
## Calculated cumulative probability at value 1 : 0.0289226319595083
## Target cumulative probability at value 1 : 0.05
## Mean scaled difference: 0.4215474
# Ensure consistent results with the random number generator
# not for each 'run' of the MC simulation but for
# consistency each time we call on the simulation
set.seed(1234)
garden_simulation_results <- mcSimulation(
estimate = estimate_read_csv("inputs_school_garden.csv"),
model_function = school_garden_function,
numberOfModelRuns = 1000, #run 1000 times
functionSyntax = "plainNames"
)
## Warning: Variable: outside_investment_value distribution: posnorm
## Calculated value of 5%-quantile: 1.77932900574854
## Target value of 5%-quantile: 1
## Calculated cumulative probability at value 1 : 0.0289226319595083
## Target cumulative probability at value 1 : 0.05
## Mean scaled difference: 0.4215474
## Warning: Calculated value of 5%-quantile: 1.77932900574854
## Target value of 5%-quantile: 1
## Calculated cumulative probability at value 1 : 0.0289226319595083
## Target cumulative probability at value 1 : 0.05
## Mean scaled difference: 0.4215474
The way we present NPV values can influence decision makers. The same information presented in different ways can even lead to different decisions. Here we derive a plot that compares the decision options as pure NPV values.
source("functions/plot_distributions.R")
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
plot_distributions(mcSimulation_object = garden_simulation_results,
vars = c("NPV_garden","NPV_garden_STEM", "NPV_no_garden"),
method = 'hist_simple_overlay',
base_size = 7,
x_axis_name = "Comparative NPV outcomes")
Under Prospect Theory the way we present NPV values can influence decision makers - the same information presented in different ways can lead to different decisions. For example, framing a projected NPV gain as a “reduction in potential loss” might make it more attractive to decision makers due to loss aversion.
Here we plot the distribution for the decision and frame the
projected NPV gain for the ‘decision’ (distributions for the two options
with the NPV values of the no garden option subtracted from
those for the garden). If we display this as a “Reduction
in potential loss” it might be more attractive to decision makers due to
loss aversion, i.e. school boards might put more emphasis on avoiding
potential losses than on seeking gains. We can frame our results as a
strategy that minimizes losses rather than one that maximizes gains.
plot_distributions(mcSimulation_object = garden_simulation_results,
vars = c("decision", "decision_STEM"),
method = 'hist_simple_overlay',
base_size = 7,
x_axis_name = "Reduction in potential loss")
Summary of the garden intervention options with gt_plt_summary() from {gtExtras} and with options from {svglite}.
# Subset the outputs from the mcSimulation function (y) to summarize only on the variables that we want.
# names(garden_simulation_results$x)
mcSimulation_summary <- data.frame(garden_simulation_results$x[2:61],
# names(garden_simulation_results$x)
garden_simulation_results$y[1:7])
gt_plt_summary(mcSimulation_summary)
| mcSimulation_summary | ||||||
| 1000 rows x 67 cols | ||||||
| Column | Plot Overview | Missing | Mean | Median | SD | |
|---|---|---|---|---|---|---|
| discount_rate | 0.0% | 6.5 | 6.5 | 0.9 | ||
| size_of_garden | 0.0% | 75.2 | 75.2 | 14.9 | ||
| CV_value | 0.0% | 0.3 | 0.3 | 0.1 | ||
| inflation_rate | 0.0% | 7.5 | 7.5 | 1.5 | ||
| if_students_like | 0.0% | 0.6 | 0.6 | 0.1 | ||
| if_parents_like | 0.0% | 0.7 | 0.7 | 0.1 | ||
| if_community_likes | 0.0% | 0.5 | 0.5 | 0.2 | ||
| if_effective_manage | 0.0% | 0.6 | 0.6 | 0.1 | ||
| if_garden_yield_enough | 0.0% | 0.5 | 0.5 | 0.1 | ||
| if_garden_healthy | 0.0% | 0.7 | 0.7 | 0.1 | ||
| if_teachers_like | 0.0% | 0.5 | 0.5 | 0.2 | ||
| if_effective_teaching | 0.0% | 0.6 | 0.6 | 0.2 | ||
| if_effective_training | 0.0% | 0.5 | 0.5 | 0.2 | ||
| if_offer_green_space | 0.0% | 0.7 | 0.7 | 0.1 | ||
| if_reduce_polution | 0.0% | 0.3 | 0.3 | 0.1 | ||
| if_biophysical_good | 0.0% | 0.3 | 0.3 | 0.1 | ||
| equipment_cost | 0.0% | 74.4 | 74.8 | 15.6 | ||
| construction_cost | 0.0% | 22.6 | 22.6 | 4.8 | ||
| garden_designing_costs | 0.0% | 12.5 | 12.5 | 1.6 | ||
| teacher_training_cost | 0.0% | 12.5 | 12.7 | 4.5 | ||
| school_board_planning | 0.0% | 9.0 | 9.0 | 1.8 | ||
| teaching_equipment | 0.0% | 7.6 | 7.5 | 1.5 | ||
| compost_starting | 0.0% | 7.5 | 7.5 | 1.5 | ||
| worm_starting | 0.0% | 3.4 | 3.4 | 0.9 | ||
| livestock_costs | 0.0% | 3.5 | 3.5 | 0.9 | ||
| if_family_pays_establishment | 0.0% | 0.3 | 0.4 | 0.1 | ||
| establishment_family_portion_paid | 0.0% | 0.3 | 0.3 | 0.1 | ||
| maintaining_labor | 0.0% | 32.6 | 32.4 | 4.6 | ||
| teacher_salary_cost | 0.0% | 24.9 | 25.0 | 3.1 | ||
| teaching_equipment_annual | 0.0% | 7.6 | 7.6 | 1.5 | ||
| teaching_tools | 0.0% | 3.5 | 3.5 | 0.9 | ||
| seed_costs | 0.0% | 1.5 | 1.5 | 0.3 | ||
| fertilizer | 0.0% | 1.5 | 1.5 | 0.3 | ||
| plant_protection | 0.0% | 3.5 | 3.6 | 0.9 | ||
| livestock_maint | 0.0% | 6.0 | 6.1 | 2.4 | ||
| annual_teacher_training | 0.0% | 4.0 | 4.0 | 0.6 | ||
| if_school_has_canteen | 0.0% | 0.3 | 0.3 | 0.1 | ||
| canteen_savings | 0.0% | 7.5 | 7.5 | 1.6 | ||
| sale_of_yield | 0.0% | 20.1 | 20.1 | 6.0 | ||
| extra_cirricular_savings | 0.0% | 60.9 | 61.6 | 24.4 | ||
| formal_edu_savings | 0.0% | 9.2 | 8.8 | 5.8 | ||
| formal_edu_savings_STEM | 0.0% | 60.4 | 60.1 | 24.6 | ||
| outside_investment_value | 0.0% | 33.0 | 23.4 | 32.5 | ||
| outside_investment_value_STEM | 0.0% | 170.0 | 125.1 | 155.6 | ||
| increased_enrollment_value | 0.0% | 9.3 | 8.3 | 6.0 | ||
| increased_enrollment_value_STEM | 0.0% | 50.8 | 49.6 | 27.2 | ||
| if_increase_tuition | 0.0% | 0.0 | 0.0 | 0.0 | ||
| if_increase_tuition_STEM | 0.0% | 0.2 | 0.2 | 0.1 | ||
| tuition_increase | 0.0% | 7.5 | 7.5 | 1.6 | ||
| child_veg_access | 0.0% | 7.5 | 7.5 | 1.5 | ||
| child_healthier_choices | 0.0% | 58.3 | 57.2 | 23.8 | ||
| child_healthier_choices_STEM | 0.0% | 298.1 | 291.7 | 125.4 | ||
| green_space_value | 0.0% | 149.5 | 151.0 | 29.4 | ||
| reduce_polution_value | 0.0% | 15.0 | 14.9 | 3.1 | ||
| school_event_value | 0.0% | 29.6 | 29.5 | 12.0 | ||
| school_event_freq | 0.0% | 6.1 | 6.0 | 2.5 | ||
| value_of_non_garden_land_use | 0.0% | 35.5 | 35.4 | 9.3 | ||
| if_parking | 0.0% | 0.1 | 0.1 | 0.0 | ||
| parking_value | 0.0% | 149.2 | 147.2 | 30.2 | ||
| costs_of_non_garden_land_use | 0.0% | 3.0 | 2.9 | 1.2 | ||
| NPV_garden | 0.0% | 624.8 | 554.3 | 355.1 | ||
| NPV_garden_STEM | 0.0% | 1,485.3 | 1,388.0 | 649.7 | ||
| NPV_no_garden | 0.0% | 327.5 | 247.4 | 277.6 | ||
| decision | 0.0% | 297.3 | 293.1 | 455.5 | ||
| decision_STEM | 0.0% | 1,157.8 | 1,070.2 | 711.4 | ||
| total_costs | 0.0% | 412.8 | 412.4 | 36.3 | ||
| total_costs_STEM | 0.0% | 642.7 | 644.1 | 43.4 | ||
Summary of the savings for the passive education garden option
summary(garden_simulation_results$y$decision)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1491.97 58.95 293.11 297.26 545.09 2376.83
Summary of the savings for the formal STEM education garden option
summary(garden_simulation_results$y$decision_STEM)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -979.1 685.6 1070.2 1157.8 1548.9 4225.0
Summary of costs
summary(garden_simulation_results$y$total_costs)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 266.6 389.5 412.4 412.8 437.0 562.3
summary(garden_simulation_results$y$total_costs_STEM)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 503.1 611.8 644.1 642.7 670.6 816.3
summary(garden_simulation_results$y$Cashflow_garden1)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -134.073 -33.497 6.993 16.995 59.232 383.734
summary(garden_simulation_results$y$Cashflow_garden_STEM1)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -135.31 75.28 148.53 169.54 239.58 715.86
source("functions/plot_cashflow.R")
plot_cashflow(mcSimulation_object = garden_simulation_results,
cashflow_var_name = "Cashflow_garden")
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning: Removed 5 rows containing missing values (`geom_line()`).
## Removed 5 rows containing missing values (`geom_line()`).
Cashflow of the garden option with formal STEM education
source("functions/plot_cashflow.R")
plot_cashflow(mcSimulation_object = garden_simulation_results,
cashflow_var_name = "Cashflow_garden_STEM")
Here we assess value of information with the multi_EVPI
function.
# Subset the outputs from the mcSimulation function (y) by selecting the correct variables be sure to run the multi_EVPI only on the variables that we want. Find them with names(garden_simulation_results$y)
mcSimulation_table <- data.frame(garden_simulation_results$x,
garden_simulation_results$y[1:5])
Value of information for the garden option (no STEM).
source("functions/multi_EVPI.R")
# first_out_var is the first result variable in the table, "NPV_garden" in our case.
evpi <- multi_EVPI(mc = mcSimulation_table, first_out_var = "NPV_garden")
## [1] "Processing 5 output variables. This can take some time."
## [1] "Output variable 1 (NPV_garden) completed."
## [1] "Output variable 2 (NPV_garden_STEM) completed."
## [1] "Output variable 3 (NPV_no_garden) completed."
## [1] "Output variable 4 (decision) completed."
## [1] "Output variable 5 (decision_STEM) completed."
source("functions/plot_evpi.R")
plot_evpi(evpi, decision_vars = "decision")
Value of information for the garden option with formal STEM
education.
# using the results of the same multi_EVPI
plot_evpi(evpi, decision_vars = "decision_STEM")
## Warning: There are no variables with a positive EVPI. You probably do not need
## a plot for that.
We use Projection to Latent Structures model to get some sense of the correlation strength and direction for model variables and our outcome variables.
For passive education garden option
source("functions/pls_model.R")
pls_result <- pls_model(object = garden_simulation_results,
resultName = names(garden_simulation_results$y)[1], # the "NPV_garden"
ncomp = 1)
# read in the common input table
input_table <- read.csv("inputs_school_garden.csv")
# source the plot function
source("functions/plot_pls.R")
plot_pls(pls_result, input_table = input_table, threshold = 0.9)
For school garden with formal STEM education
pls_result_STEM <- pls_model(object = garden_simulation_results,
resultName = names(garden_simulation_results$y)[2], # the "NPV_garden_STEM"
ncomp = 1)
plot_pls(pls_result_STEM, input_table = input_table, threshold = 0.9)
The full repository can be accessed at https://github.com/CWWhitney/nifam_codas_school_garden with the following QR code.